Adaptive Checkpointing Schemes for Fault Tolerance in Real-Time Systems with Task Duplication

Authors

  • Zhongwen Li
  • Hong Chen
Abstract

This paper studies dynamic adaptation techniques based on checkpointing. By placing store-checkpoints and compare-checkpoints between CSCPs (store-and-compare-checkpoints), we first present adaptive checkpointing schemes in which the checkpointing interval for a task is adjusted dynamically online. Taking the overheads of comparison and storage into account, we use renewal equations to obtain the average execution time to complete a task under each proposed scheme. We then analytically derive the optimal numbers of checkpoints that minimize the average execution times, and extend the proposed schemes to a set of multiple tasks in real-time systems. Simulation results show that, compared with the previous method, the proposed approach significantly increases the likelihood of timely task completion.
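As a rough illustration of how a renewal argument can yield an optimal checkpoint count, the sketch below uses a deliberately simplified model that is an assumption of this example, not the paper's analysis: a task of length `task_len` is split into `n` equal segments, each followed by a single combined checkpoint costing `overhead`; faults arrive as a Poisson process with rate `fault_rate`; a fault is detected only at the checkpoint, and the segment is then re-executed. Under these assumptions the expected time E for one segment of length s satisfies E = s + (1 - e^{-λs})E, so E = s·e^{λs}. All function and parameter names are hypothetical.

```python
import math


def expected_time(task_len, n, overhead, fault_rate):
    """Expected completion time with n equal segments (simplified model).

    Assumption: each segment (work plus one checkpoint) is retried as a
    whole until it completes fault-free, so the renewal equation
    E = s + (1 - p) * E with p = exp(-fault_rate * s) gives E = s / p.
    """
    seg = task_len / n + overhead          # one segment including its checkpoint
    return n * seg * math.exp(fault_rate * seg)


def best_checkpoint_count(task_len, overhead, fault_rate, n_max=200):
    """Brute-force search for the segment count with the smallest expected time."""
    return min(
        range(1, n_max + 1),
        key=lambda n: expected_time(task_len, n, overhead, fault_rate),
    )


if __name__ == "__main__":
    n_opt = best_checkpoint_count(task_len=100.0, overhead=1.0, fault_rate=0.01)
    print(n_opt, expected_time(100.0, n_opt, 1.0, 0.01))
```

The trade-off is visible in the formula: more segments shorten each retry but add checkpoint overhead, so the expected time has an interior minimum whenever the fault rate is positive. The paper's schemes distinguish store, compare, and combined checkpoints with separate overheads, which this single-overhead sketch does not model.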

Similar resources

Analysis of checkpointing for schedulability of real-time systems

Checkpointing is a relatively cost-effective method for achieving fault tolerance in real-time systems. Since checkpointing schemes depend on time redundancy, they could affect the correctness of the system by causing deadlines to be missed. This paper provides exact schedulability tests for fault-tolerant task sets under a specified failure hypothesis and employing checkpointing to assist in fau...

Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid

Grid computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Fault tolerance and task scheduling are important issues for large-scale computational grids because of the unreliable nature of grid resources. A commonly exploited technique for realizing fault tolerance is periodic checkpointing, which periodically ...

Analysis of Checkpointing Schemes for Multiprocessor Systems

Parallel computing systems provide hardware redundancy that helps to achieve low-cost fault tolerance, by duplicating the task into more than a single processor, and comparing the states of the processors at checkpoints. This paper suggests a novel technique, based on a Markov Reward Model (MRM), for analyzing the performance of checkpointing schemes with task duplication. We show how thi...

Fault Recovery Based on Checkpointing for Hard Real-Time Embedded Systems

Safety-critical embedded systems often operate in harsh environmental conditions that necessitate fault-tolerant computing techniques. Many safety-critical systems also execute real-time applications. The correctness of these systems depends not only on the logical result of computation, but also on the time at which the results are produced. The missing of task deadlines can therefore be viewed...

Adaptive Checkpointing

Checkpointing is a typical approach to tolerate failures in today’s supercomputing clusters and computational grids. Checkpoint data can be saved either in central stable storage, or in processor memory (as in diskless checkpointing), or local disk space (replacing memory with local disk in diskless checkpointing). But where to save the checkpoint data has a great impact on the performance of a...

Publication date: 2006